1 Introduction

ggplot2 is built on the grammar of graphics, which has seven elements. Think of a sentence: each part has a grammatical function that helps convey a particular message.

In the book The Grammar of Graphics, Leland Wilkinson sets out two principles:

  • Graphics are built from distinct layers of graphical elements;

  • Meaningful insights are built through ‘aesthetic mapping’.

Each of the seven elements of this visual grammar has its own jargon.

This course considers the following elements: data, aesthetics, geometries, and themes.

1.1 mtcars: description

# Load ggplot2 (plotting) and dplyr (provides %>% and mutate(), used below)
library(ggplot2)
library(dplyr)

# Explore the mtcars data frame with str()
str(mtcars)
## 'data.frame':    32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
# Create some factors in the data to make the visualizations more appealing
# Using am and cyl as factors
mtcars_mod <- 
  mtcars %>% 
  mutate(fam = factor(am), 
         fcyl = factor(cyl))

levels(mtcars_mod$fam) <- c("automatic", "manual")

Data dictionary:

mtcars is a data frame with 32 observations and 11 variables: mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, and carb.

  1. mpg: Miles per (US) gallon

  2. cyl: Number of cylinders

  3. disp: Displacement (cu. in.)

  4. hp: Gross horsepower

  5. drat: Rear axle ratio

  6. wt: Weight (1000 lbs)

  7. qsec: 1/4 mile time

  8. vs: Engine shape (0 = V-shaped, 1 = straight)

  9. am: Transmission (0 = automatic, 1 = manual)

  10. gear: Number of forward gears

  11. carb: Number of carburetors

1.2 Plots

1.2.1 Basic: fuel economy vs. number of cylinders

# Execute the following command
ggplot(mtcars, aes(cyl, mpg)) +
  geom_point()

1.2.2 Color: weight vs. fuel economy

# Map displacement (disp) to the color aesthetic
ggplot(mtcars, aes(wt, mpg, color = disp)) +
  geom_point()

1.2.3 Size: point size by displacement

# Change the color aesthetic to a size aesthetic
ggplot(mtcars_mod, aes(wt, mpg, size = disp)) +
  geom_point()

1.2.4 Shape: point shape by transmission

ggplot(mtcars_mod, aes(wt, mpg, shape = fam)) +
  geom_point()

1.3 ggplot2 and its layers

We have four layers to work with:

  1. Data: the data table we want to plot;
  2. Aesthetics: the mappings from variables to visual properties, such as the x and y coordinates;
  3. Geometries: the type of plot;
  4. Theme: the theme we choose for the plot.

Put another way: we have the data, we need to map it (aesthetics), choose the best type of plot (geometries), and think about the best theme (theme).
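As a minimal sketch of how these four layers stack, the code below builds one plot step by step from the built-in mtcars data. The object names (p_base, p_points, p_final) are illustrative only and are not used elsewhere in these notes.

```r
library(ggplot2)

# 1. Data and 2. aesthetics: the data frame and the variable-to-axis mapping
p_base <- ggplot(mtcars, aes(x = wt, y = mpg))

# 3. Geometry: draw each observation as a point
p_points <- p_base + geom_point()

# 4. Theme: change the non-data appearance of the plot
p_final <- p_points + theme_minimal()

p_final
```

Because each step returns a plot object, any intermediate result can be stored and reused with different geometries or themes.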

1.3.1 Plot using smooth

ggplot(diamonds, aes(carat, price)) +
  geom_point() +
  geom_smooth()

1.3.2 Plot using alpha

# Make the points 40% opaque
ggplot(diamonds, aes(carat, price, color = clarity)) +
  geom_point(alpha = 0.4) +
  geom_smooth()

1.3.3 Plot without smooth

# From previous step
plt_price_vs_carat <- ggplot(diamonds, aes(carat, price))

# Edit this to map color to clarity,
# Assign the updated plot to a new object
plt_price_vs_carat_by_clarity <- 
plt_price_vs_carat + geom_point(aes(color = clarity))

# See the plot
plt_price_vs_carat_by_clarity

1.3.4 With alpha and shape

# Plot price vs. carat, colored by clarity
plt_price_vs_carat_by_clarity <- ggplot(diamonds, aes(carat, price, color = clarity))

# Set transparency to 0.5 and use shape = 16
plt_price_vs_carat_by_clarity + geom_point(alpha = 0.5, shape = 16)


2 Aesthetics

2.1 Visible aesthetics

Aesthetic   Description
x           X axis position
y           Y axis position
fill        fill color
color       color of points, outlines of other geoms
size        area or radius of points, thickness of line
alpha       transparency
linetype    line dash pattern
label       text on a plot or axes
shape       shape of the points

See how the plots change as each attribute varies:

ggplot(mtcars, aes(wt, mpg, color = disp)) +
  # Set the shape and size of the points
  geom_point(shape = 1,  size = 4)

# Map fill to disp; shape 21 has both an interior (fill) and an outline (color)
ggplot(mtcars, aes(wt, mpg, fill = disp)) +
  geom_point(shape = 21, size = 4, alpha = 0.6)

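Now let's also map color. With shape = 21, color controls only the outline of each point, while fill still controls the interior. Observe:

# Map fill to disp and the outline color to gear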
ggplot(mtcars, aes(wt, mpg, fill = disp, color = gear)) +
  geom_point(shape = 21, size = 4, alpha = 0.6)

Note that mapping a categorical variable to fill does not change the colors of points drawn with the default shape, even though a legend is generated. This is because the default point shape has only a color attribute and no fill attribute. Use fill when you have another shape (such as a bar), or when using a point that has both a fill and a color attribute, such as shape = 21, which is a circle with an outline. Whenever you use a solid color, be sure to use alpha blending to compensate for overplotting.
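To make the difference concrete, here is a small sketch that maps the same categorical variable to fill twice: once with the default point shape and once with shape = 21. The names p_default and p_filled are illustrative only.

```r
library(ggplot2)

# Default point shape: it has no interior fill, so the fill mapping is
# computed but the points keep a single default color
p_default <- ggplot(mtcars, aes(wt, mpg, fill = factor(cyl))) +
  geom_point(size = 4)

# Shape 21: a circle with an outline (color) and an interior (fill),
# so the fill mapping becomes visible
p_filled <- ggplot(mtcars, aes(wt, mpg, fill = factor(cyl))) +
  geom_point(shape = 21, size = 4, alpha = 0.6)

p_default
p_filled
```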

2.1.1 Labels in place of points

# Base layer
plt_mpg_vs_wt <- ggplot(mtcars, aes(wt, mpg))

# Use text layer and map gear to label
plt_mpg_vs_wt +
  geom_text(aes(label = gear))

2.1.2 Attributes vs. aesthetics

Now let's use the “color” attribute outside aes(). Note the difference:

# A hexadecimal color
my_blue <- "#4ABEFF"

ggplot(mtcars, aes(wt, mpg)) +
  # Set the point color and alpha
  geom_point(color = my_blue, alpha = 0.6)

Note that an attribute is a constant value that applies to every point and is set outside aes(). This changes all points at once.
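A common mistake is to put the constant inside aes(). The sketch below contrasts the two cases; p_attribute and p_mapped are illustrative names. Inside aes(), the string is treated as data with a single category, so ggplot2 picks a color from its default scale and adds a legend instead of using the requested blue.

```r
library(ggplot2)

my_blue <- "#4ABEFF"

# Attribute: the constant is set outside aes(), so every point is drawn in my_blue
p_attribute <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(color = my_blue)

# Aesthetic: the constant is mapped inside aes(), so it is treated as a
# one-level categorical variable and colored by the default discrete scale
p_mapped <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(color = my_blue))

p_attribute
p_mapped
```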

2.1.3 Shape with size

We can map factors to shape to create customized points:

# 5 aesthetics: add a mapping of size to hp / wt
ggplot(mtcars_mod, aes(mpg, qsec, color = disp, shape = fam, size = hp/wt)) +
  geom_point()

2.1.4 Positions and color

palette <- c(automatic = "#377EB8", manual = "#E41A1C")

# Set the position
ggplot(mtcars_mod, aes(cyl, fill = fam)) +
  geom_bar(position = 'dodge') +
  labs(x = "Number of Cylinders", y = "Count")+
  scale_fill_manual("Transmission", values = palette)

2.1.5 Scatter with y = 0

# Plot 0 vs. mpg
ggplot(mtcars, aes(x=mpg, y=0)) +
  # Add jitter 
  geom_point(position = 'jitter') + 
  ylim(-1,1)

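What was done here? y is fixed at 0, so without jitter every point would sit on a single horizontal line. position = 'jitter' adds a small amount of random noise to the point positions, which separates overlapping mpg values, and ylim(-1, 1) keeps the points in a narrow band.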

2.1.6 Aesthetics: best practices

GOLDEN RULE: FORM FOLLOWS FUNCTION!

Function:

  1. Primary: accurate and efficient representations;
  2. Secondary: visually appealing, beautiful plots.

Never:

  • Misrepresent or obscure the data;

  • Confuse the audience with complexity.

Best choices for aesthetics:

  • Efficiency: provide a visualization that is faster and easier to read than numerical summaries;

  • Accuracy: minimize information loss.

For example, be careful with the scales of plots that are meant to be compared: inconsistent scales can confuse the reader and lead to misinterpretation. Use jitter and alpha to reduce overplotting, since many repeated values may overlap and need specific techniques to become visible. An incorrect aesthetic mapping confuses or misleads the audience.

Plotting on x and y coordinates should follow some conventions. For example, the dependent variable is normally mapped to the y axis and the independent variable to the x axis. Following a few best practices makes it possible to convey a lot of information in a relatively simple way.

Next, we will look at some techniques that help turn our plots into more faithful visual representations of the data.

2.1.7 Means inside the plot

# Compute the mean of every numeric column for each species
iris.summary <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), mean))

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  # Inherits both data and aes from ggplot()   
  geom_point() + 
  # Different data, but inherited aes  
  geom_point(data = iris.summary, shape = 15, size = 5)

With different shapes:

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) +
  geom_point() +
  # Shape 21 has an outline (col, inherited) and an interior (fill)
  geom_point(data = iris.summary,
             shape = 21,
             size = 5,
             fill = "black",
             stroke = 2)

See the ‘shape’ attribute and its values.
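As a rough way to inspect the shape codes, the sketch below draws the numeric point shapes 0 to 25 on a small grid. The data frame shapes and the grid layout are made up for illustration. Only shapes 21 to 25 have an interior, so the red fill appears on those alone.

```r
library(ggplot2)

# One row per shape code, laid out on a 7-column grid
shapes <- data.frame(
  shape = 0:25,
  x = 0:25 %% 7,
  y = -(0:25 %/% 7)
)

p_shapes <- ggplot(shapes, aes(x, y)) +
  # scale_shape_identity() uses the numbers in the data directly as shape codes
  geom_point(aes(shape = shape), size = 5, fill = "red") +
  geom_text(aes(label = shape), nudge_y = -0.35) +
  scale_shape_identity() +
  theme_void()

p_shapes
```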

3 Geometries

Let's now look at the types of geometries (geom_*) that can be used. There are 48 geometries, although some may be redundant.

Each geom accepts specific aesthetics, some essential and some optional. Let's look at the scatter plot.

  • geom_point()

    • essential: x, y

    • optional: alpha, color, fill, shape, size, stroke.

Let's go through some examples showing how adding these elements changes the plot.

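3.0.1 Comparison of jitter and dodge

# Note: par(mfrow = ...) applies only to base graphics and has no effect on
# ggplot2, so each plot below is drawn separately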

# G1
# Plot base
plt_mpg_vs_fcyl_by_fam <- 
ggplot(mtcars_mod, aes(fcyl, mpg, color = fam))

# Default points are shown for comparison.
# The base object is not overwritten, so the plots below do not inherit
# an extra layer of unjittered points
plt_mpg_vs_fcyl_by_fam +
  geom_point() +
  ggtitle('Normal')

# G2
# Plot base
# Alter the point positions by jittering, width 0.3
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_jitter(width = 0.3)) + ggtitle('w/ Jitter')

#G3
# Now dodge the point positions
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_dodge(width=0.3)) + ggtitle('w/ Dodge')


#G4
# Now jitter and dodge the point positions
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_jitterdodge(jitter.width = 0.3, dodge.width=0.3))  + ggtitle('w/ Jitter and Dodge')


Notice that jitter can be a geom itself (i.e. geom_jitter()), an argument in geom_point() (i.e. position = "jitter"), or a position function, (i.e. position_jitter()).
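The three spellings can be compared side by side. In this sketch the names base_plot, p_geom, p_arg, and p_fun are illustrative; the seed argument of position_jitter() is set only to make the random noise reproducible.

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(factor(cyl), mpg))

# 1. Jitter as a geom
p_geom <- base_plot + geom_jitter(width = 0.3)

# 2. Jitter as a position argument given by name (default jitter amounts)
p_arg <- base_plot + geom_point(position = "jitter")

# 3. Jitter as a position function, which exposes width and seed
p_fun <- base_plot +
  geom_point(position = position_jitter(width = 0.3, seed = 123))

p_fun
```

The position function is the most flexible form, because it is the only one of the three that lets you set the width together with a seed.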

See how jitter and shape can greatly improve our plots and analyses.

# The Vocab data set ships with carData, which car attaches when it loads
require(car)

# Plot vocabulary vs. education
ggplot(Vocab, aes(x =education , y = vocabulary)) +
  # Add a point layer
  geom_point() + 
  ggtitle('Default')

# Replace the point layer with a jitter layer.
ggplot(Vocab, aes(education, vocabulary)) +
  # Change to a jitter layer
  geom_jitter() +
  ggtitle('w/ geom_jitter')

# Replace the point layer with a jitter layer and alpha = 0.2
ggplot(Vocab, aes(education, vocabulary)) +
  # Set the transparency to 0.2
  geom_jitter(alpha = 0.2) +
  ggtitle('w/ geom_jitter and alpha = 0.2')

# Replace the point layer with a jitter layer, alpha = 0.2 and shape =1
ggplot(Vocab, aes(education, vocabulary)) +
  # Set the shape to 1
  geom_jitter(alpha = 0.2, shape = 1) +
  ggtitle('w/ geom_jitter, alpha = 0.2 and shape = 1')

3.0.2 Comparing jitter widths

# G2.1
# Alter the point positions by jittering, width 0.1
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_jitter(width = 0.1)) + ggtitle('w/ Jitter - width = 0.1')

# G2.2
# Alter the point positions by jittering, width 0.3
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_jitter(width = 0.3)) + ggtitle('w/ Jitter - width = 0.3')

# G2.3
# Alter the point positions by jittering, width 0.6
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_jitter(width = 0.6)) + ggtitle('w/ Jitter - width = 0.6')

# G2.4
# Alter the point positions by jittering, width 0.9
plt_mpg_vs_fcyl_by_fam + geom_point(position = position_jitter(width = 0.9)) + ggtitle('w/ Jitter - width = 0.9')

Using density:

datacamp_light_blue <- "#51A8C9"
# Using density: after_stat(density) replaces the deprecated ..density.. notation
ggplot(mtcars, aes(mpg, y = after_stat(density))) +
  # Set the fill color to datacamp_light_blue
  geom_histogram(binwidth = 3.5, fill = datacamp_light_blue) + 
  # In order to see density use:
  geom_density(lwd = 0.5,
               linetype = 1,
               colour = 2)

3.0.3 Positions in histograms

Here, we’ll examine the various ways of applying positions to histograms. geom_histogram(), a special case of geom_bar(), has a position argument that can take on the following values:

  • stack (the default): Bars for different groups are stacked on top of each other.

  • dodge: Bars for different groups are placed side by side.

  • fill: Bars for different groups are shown as proportions.

  • identity: Plot the values as they appear in the dataset.

Observe the difference:


# Update the aesthetics so the fill color is by fam
ggplot(mtcars_mod, aes(mpg, fill = fam)) +
  geom_histogram(binwidth = 1, position = "identity") +
  ggtitle('Bar - identity')

# Change the position to dodge
ggplot(mtcars_mod, aes(mpg, fill = fam)) +
  geom_histogram(binwidth = 1, position = "dodge") +
  ggtitle('Bar - dodge')

# Change the position to fill
ggplot(mtcars_mod, aes(mpg, fill = fam)) + 
  geom_histogram(binwidth = 1, position = "fill") +
  ggtitle('Bar - fill')

# Change the position to identity, with transparency 0.4
ggplot(mtcars_mod, aes(mpg, fill = fam)) +
  geom_histogram(binwidth = 1, position = "identity", alpha = 0.4) +
  ggtitle('Bar - identity + alpha')


3.0.4 Barplot

3.0.4.1 Position in bar and col plots

Let’s see how the position argument changes geom_bar().

We have three position options:

  • stack: The default

  • dodge: Preferred

  • fill: To show proportions

While we will be using geom_bar() here, note that geom_col() is just geom_bar() with the stat argument set to "identity"; its position still defaults to "stack". It is used when we want the heights of the bars to represent the exact values in the data.
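The relationship between the two geoms can be sketched as follows. The counts table is built here only for illustration: geom_bar() counts the rows itself, while geom_col() and geom_bar(stat = "identity") plot counts that were computed beforehand.

```r
library(ggplot2)

# geom_bar() counts the rows in each category
p_bar <- ggplot(mtcars, aes(factor(cyl))) +
  geom_bar()

# Pre-computed counts: one row per category, with the count in column n
counts <- as.data.frame(table(cyl = mtcars$cyl), responseName = "n")

# geom_col() plots the values in n as bar heights
p_col <- ggplot(counts, aes(cyl, n)) +
  geom_col()

# Same result, written with geom_bar() and an explicit stat
p_bar_identity <- ggplot(counts, aes(cyl, n)) +
  geom_bar(stat = "identity")

p_col
```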

## Plot fcyl, filled by fam - default position (stack)
ggplot(mtcars_mod, aes(x=fcyl, fill = fam)) +
  # Add a bar layer
  geom_bar() +
  ggtitle('Bar - stack')

## Plot fcyl, filled by fam - fill
ggplot(mtcars_mod, aes(fcyl, fill = fam)) +
  # Set the position to "fill"
  geom_bar(position='fill') +
  ggtitle('Bar - fill')

## Plot fcyl, filled by fam - dodge
ggplot(mtcars_mod, aes(fcyl, fill = fam)) +
  # Change the position to "dodge"
  geom_bar(position = "dodge") +
  ggtitle('Bar - dodge')

Bar plot with error bars

# Calculate descriptive statistics of Sepal.Width for each species
iris_summ_long <-
  iris %>%
  group_by(Species) %>%
  summarise(avg = mean(Sepal.Width),
            stdev = sd(Sepal.Width))

# fill is a constant attribute, so it is set outside aes()
ggplot(iris_summ_long, aes(x = Species, y = avg)) +
  geom_col(fill = datacamp_light_blue) +
  geom_errorbar(aes(ymin = avg - stdev, ymax = avg + stdev), width = 0.1)

3.0.5 Line Plots

Simple line graph:

link = 'https://assets.datacamp.com/production/repositories/5171/datasets/fd66a8c2408f8cccc24df8ce2668e0e195519532/fish.RData'

load(url(link))

# Loading an .RData file restores its data sets directly into the workspace

# Plot the Rainbow Salmon time series
ggplot(fish.species, aes(y = Rainbow, x = Year)) +
  geom_line()

3.0.5.1 Line plot by group

# Plot multiple time-series by grouping by species
ggplot(fish.tidy, aes(x = Year, y = Capture)) +
  geom_line(aes(group = Species))

3.0.5.2 Line plot by linetype

ggplot(fish.tidy, aes(x = Year, y = Capture, linetype = Species)) + 
  geom_line()

Notice that neither group nor linetype gives a readable plot. Let's try color.

3.0.5.3 Line plot by color

# Plot multiple time-series by coloring by species
ggplot(fish.tidy, aes(x = Year, y = Capture, color = Species)) +
  geom_line()

3.0.5.4 Ribbon plot

ggplot(fish.tidy, aes(x = Year, y = Capture, fill = Species)) +
  geom_ribbon(aes(ymax = Capture, ymin = 0), alpha = 0.3)

3.0.5.5 Area plot

ggplot(fish.tidy, aes(x = Year, y = Capture, fill = Species)) +   
  geom_area() +
  ggtitle('Position default')

ggplot(fish.tidy, aes(x = Year, y = Capture, fill = Species)) +   
  geom_area(position='identity') +
  ggtitle('Position identity')

Note that without position = 'identity' the areas are stacked on top of one another.

4 Themes

Each aesthetic has

4.1 Moving the legend

Let’s wrap up this course by making a publication-ready plot communicating a clear message.

To change stylistic elements of a plot, call theme() and set plot properties to a new value. For example, the following changes the legend position.

p + theme(legend.position = new_value)

Here, the new value can be

  • "top", "bottom", "left", or "right": place it at that side of the plot.

  • "none": don’t draw it.

  • c(x, y): c(0, 0) means the bottom-left and c(1, 1) means the top-right.
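As a quick sketch of the named positions, the code below moves and then removes the legend of a small mtcars plot. The names p, p_bottom, and p_none are illustrative only.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point()

# Place the legend below the plot
p_bottom <- p + theme(legend.position = "bottom")

# Do not draw the legend at all
p_none <- p + theme(legend.position = "none")

p_bottom
p_none
```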

First, let's produce a default plot with ggplot2:

plt_prop_unemployed_over_time <-
ggplot(fish.tidy, aes(x = Year, y = Capture, color = Species)) +
  geom_line()

plt_prop_unemployed_over_time

Removing the legend

# Remove legend entirely
plt_prop_unemployed_over_time +
  theme(legend.position = 'none')

legend.key : https://www.r-graph-gallery.com/239-custom-layout-legend-ggplot2.html

element_rect : https://ggplot2.tidyverse.org/reference/element.html

4.2 Modifying theme elements

Many plot elements have multiple properties that can be set. For example, line elements in the plot such as axes and gridlines have a color, a thickness (size), and a line type (solid line, dashed, or dotted). To set the style of a line, you use element_line(). For example, to make the axis lines into red, dashed lines, you would use the following.

p + theme(axis.line = element_line(color = "red", linetype = "dashed"))

Similarly, element_rect() changes rectangles and element_text() changes text. You can remove a plot element using element_blank().

To learn what each element does, we will paint them in vivid colors so that each effect is easy to see:

Pink background and green legend-key outlines:

plt_prop_unemployed_over_time +
  theme(
    # For all rectangles, set the fill color to pink
    rect =  element_rect(fill = 'pink'),
    # For the legend key, draw the outline in green
    legend.key = element_rect(color = 'green')
  )

Axis ticks in blue and grid lines in green.

plt_prop_unemployed_over_time +
  theme(
    rect = element_rect(fill = "grey92"),
    legend.key = element_rect(color = NA),
    # Draw the panel grid in thick green lines (element_blank() would turn it off)
    panel.grid = element_line(color = 'green', size = 6),
    # Draw the axis ticks in thick blue lines (element_blank() would turn them off)
    axis.ticks = element_line(color = 'blue', size = 6)
  )

Panel grid in purple with a dotted line type.

plt_prop_unemployed_over_time +
  theme(
    rect = element_rect(fill = "grey92"),
    legend.key = element_rect(color = NA),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    # Add major y-axis panel grid lines back
    panel.grid.major.y = element_line(
      # Set the color to purple
      color = 'purple',
      # Set the size to 0.6
      size = 0.6,
      # Set the line type to dotted
      linetype  = 'dotted'
    )
  )

Axis text in red and title in orange:

plt_prop_unemployed_over_time +
  ggtitle('Plot title') + 
  theme(
    rect = element_rect(fill = "grey92"),
    legend.key = element_rect(color = NA),
    axis.ticks = element_blank(),
    panel.grid = element_blank(),
    panel.grid.major.y = element_line(
      color = "white",
      size = 0.5,
      linetype = "dotted"
    ),
    # Set the axis text color to red
    axis.text = element_text(color = 'red'),
    # Set the plot title font face to italic, font size to 16, and color to orange
   plot.title = element_text(
     size = 16,
     face = 'italic',
     color = 'orange'
   )
  )

4.3 Modifying whitespace

Whitespace means all the non-visible margins and spacing in the plot.

To set a single whitespace value, use unit(x, unit), where x is the amount and unit is the unit of measure.

Borders require you to set 4 positions, so use margin(top, right, bottom, left, unit). To remember the margin order, think TRouBLe.

The default unit is "pt" (points), which scales well with text. Other options include “cm”, “in” (inches) and “lines” (of text).

Give the axis tick length, axis.ticks.length, a unit of 2 “lines”.

plt_mpg_vs_wt_by_cyl <- 
  ggplot(mtcars_mod, aes(wt, mpg, color = fcyl)) +
  geom_point()

plt_mpg_vs_wt_by_cyl +
  theme(
    # Set the axis tick length to 2 lines
    axis.ticks.length = unit(2,'lines')
  )

Give the legend key size, legend.key.size, a unit of 3 centimeters (“cm”).

plt_mpg_vs_wt_by_cyl +
  theme(
    # Set the legend key size to 3 centimeters
    legend.key.size = unit(3, "cm")
  )

Set the legend.margin to 20 points (“pt”) on the top, 30 pts on the right, 40 pts on the bottom, and 50 pts on the left.

plt_mpg_vs_wt_by_cyl +
  theme(
    # Set the legend margin to (20, 30, 40, 50) points
    legend.margin = margin(20, 30, 40, 50, 'pt')
  )

Set the plot margin, plot.margin, to 10, 30, 50, and 70 millimeters (“mm”).

plt_mpg_vs_wt_by_cyl +
  theme(
    # Set the plot margin to (10, 30, 50, 70) millimeters
    plot.margin = margin(10,30,50,70, 'mm')
  )

4.4 Built-in themes

In addition to making your own themes, there are several out-of-the-box solutions that may save you lots of time.


# Add a void theme
plt_prop_unemployed_over_time +
  theme_void() + 
  ggtitle('Theme Void')

plt_prop_unemployed_over_time +
  theme_classic() + 
  ggtitle('Theme Classic')

plt_prop_unemployed_over_time +
  theme_bw() + 
  ggtitle('Theme Black and White')

Using ggthemes

require(ggthemes)

# Use the fivethirtyeight theme
plt_prop_unemployed_over_time +
  theme_fivethirtyeight() + 
  ggtitle('theme_fivethirtyeight()')

# Use Tufte's theme
plt_prop_unemployed_over_time +
  theme_tufte() + 
  ggtitle('theme_tufte()')

# Use the Wall Street Journal theme
plt_prop_unemployed_over_time +
  theme_wsj() + 
  ggtitle('theme_wsj()')

4.5 Setting themes

Reusing a theme across many plots helps to provide a consistent style. You have several options for this.

  1. Assign the theme to a variable, and add it to each plot.

  2. Set your theme as the default using theme_set().

A good strategy that you’ll use here is to begin with a built-in theme then modify it.
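The second option deserves a short sketch, because theme_set() returns the previous default theme rather than the new one. The names my_theme, old_theme, and current_position are illustrative; the example starts from the built-in theme_classic() and restores the earlier default at the end.

```r
library(ggplot2)

# Start from a built-in theme and modify it
my_theme <- theme_classic() +
  theme(legend.position = "bottom")

# theme_set() installs the new default and returns the old one,
# so the return value is kept under a separate name
old_theme <- theme_set(my_theme)

# Plots now pick up my_theme without adding it explicitly
p <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point()
p

# theme_get() returns the theme that is currently the default
current_position <- theme_get()$legend.position

# Restore the previous default when done
theme_set(old_theme)
```

Keeping the old theme around makes it easy to undo the change at the end of a script or report.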

# Save the theme as theme_recession
theme_recession <- theme(
  rect = element_rect(fill = "grey92"),
  legend.key = element_rect(color = NA),
  axis.ticks = element_blank(),
  panel.grid = element_blank(),
  panel.grid.major.y = element_line(color = "white", size = 0.5, linetype = "dotted"),
  axis.text = element_text(color = "grey25"),
  plot.title = element_text(face = "italic", size = 16),
  #legend.position = c(0.6, 0.1)
  legend.position = c(0.9,0.5)
)

# Combine the Tufte theme with theme_recession
theme_tufte_recession <- theme_tufte() + theme_recession

# Add the Tufte recession theme to the plot
plt_prop_unemployed_over_time +
theme_tufte_recession

theme_recession <- theme(
  rect = element_rect(fill = "grey92"),
  legend.key = element_rect(color = NA),
  axis.ticks = element_blank(),
  panel.grid = element_blank(),
  panel.grid.major.y = element_line(color = "white", size = 0.5, linetype = "dotted"),
  axis.text = element_text(color = "grey25"),
  plot.title = element_text(face = "italic", size = 16),
  legend.position = 'right'
)
theme_tufte_recession <- theme_tufte() + theme_recession

# Set theme_tufte_recession as the default theme.
# theme_set() returns the previous default, so store it under a separate name
old_theme <- theme_set(theme_tufte_recession)

# Draw the plot (without explicitly adding a theme)
plt_prop_unemployed_over_time

plt_prop_unemployed_over_time +
  theme_tufte() +
  theme(
    legend.position = "none",
    axis.ticks = element_blank(),
    axis.title = element_text(color = "grey60"),
    axis.text = element_text(color = "grey60"),
    # Set the panel gridlines major y values
    panel.grid.major.y = element_line(
      # Set the color to grey60
      color = 'grey60',
      # Set the size to 0.25
      size = 0.25,
      # Set the linetype to dotted
      linetype = 'dotted'
    )
  )

4.6 Geoms for explanatory plots

Let’s focus on producing beautiful and effective explanatory plots. The plot built below uses gm2007, a filtered subset of the gapminder dataset.

This type of plot will be in an info-viz style, meaning that it would be similar to something you’d see in a magazine or website for a mostly lay audience.

require(gapminder)

gm2007_complete <- gapminder %>% 
  filter(year == 2007) %>% 
  select(country, lifeExp, continent)

set.seed(5)
gm2007 <- sample_n(gm2007_complete, size = 30, replace = F)
# Add a geom_segment() layer
ggplot(gm2007, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_segment(aes(xend = 30, yend = country), size = 2)

# Add a geom_text() layer
ggplot(gm2007, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_segment(aes(xend = 30, yend = country), size = 2) +
  geom_text(aes(label = lifeExp), color = 'white', size = 1.5)

# Set the color scale
palette <- RColorBrewer::brewer.pal(5, "RdYlBu")[-(2:4)]

# Modify the scales
ggplot(gm2007, aes(x = lifeExp, y = country, color = lifeExp)) +
  geom_point(size = 4) +
  geom_segment(aes(xend = 30, yend = country), size = 2) +
  geom_text(aes(label = round(lifeExp,1)), color = "white", size = 1.5) +
  scale_x_continuous("", expand = c(0,0), limits = c(30,90), position = 'top') +
  scale_color_gradientn(colors = palette)

# Set the color scale
palette <- RColorBrewer::brewer.pal(5, "RdYlBu")[-(2:4)]

# Add a title and caption
plt_country_vs_lifeExp <- 
ggplot(gm2007, aes(x = lifeExp, y = reorder(country, +lifeExp), color = lifeExp)) +
  geom_point(size = 5.5) +
  geom_segment(aes(xend = 30, yend = country), size = 2) +
  geom_text(aes(label = round(lifeExp,1)), color = "white", size = 2) +
  scale_x_continuous("", expand = c(0,0), limits = c(30,90), position = "top") +
  scale_color_gradientn(colors = palette) +
  labs(
    title = 'Highest and lowest life expectancies, 2007',
  caption = 'Source: gapminder'
  )

plt_country_vs_lifeExp

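We used reorder() to sort the countries by life expectancy: the factor levels increase with lifeExp, so the highest values appear at the top of the y axis (descending order when read from top to bottom). For the opposite order, use “-” instead of “+”.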
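The effect of the sign can be checked on a tiny example. The data frame scores below is made up for illustration; reorder() is the base R function used in the plot above.

```r
# Hypothetical data: three countries with made-up life expectancies
scores <- data.frame(
  country = c("A", "B", "C"),
  lifeExp = c(70, 50, 80)
)

# Levels sorted by increasing lifeExp
asc <- reorder(scores$country, scores$lifeExp)
levels(asc)   # "B" "A" "C"

# Negating the values reverses the order of the levels
desc <- reorder(scores$country, -scores$lifeExp)
levels(desc)  # "C" "A" "B"
```

Since ggplot2 draws the first level of a discrete y axis at the bottom, the ascending levels put the largest value at the top of the plot.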

4.7 Annotate() for embellishments

In the previous exercise, we completed our basic plot. Now let’s polish it by playing with the theme and adding annotations. In this exercise, you’ll use annotate() to add text and a curve to the plot.

The following values have been calculated for you to assist with adding embellishments to the plot:

global_mean <- mean(gm2007_complete$lifeExp)
x_start <- global_mean + 4
y_start <- 5.5
x_end <- global_mean
y_end <- 7.5
# Define the theme
step_1_themes <- 
    theme_classic() +
    theme(axis.line.y = element_blank(),
          axis.ticks.y = element_blank(),
          axis.text = element_text(color = 'black'),
          axis.title = element_blank(),
          legend.position = 'none')

plt_country_vs_lifeExp + step_1_themes

# Add a vertical line
plt_country_vs_lifeExp +
  step_1_themes +
  geom_vline(xintercept = global_mean, 
             color = 'grey40', 
             linetype = 3)

# Add text
plt_country_vs_lifeExp +
  step_1_themes +
  geom_vline(xintercept = global_mean, 
             color = "grey40", 
             linetype = 3) +
  annotate(
    "text",
    x = x_start, y = y_start,
    label = "The\nglobal\naverage",
    vjust = 1, size = 3, color = "grey40"
  )

# Add a curve
plt_country_vs_lifeExp +  
  step_1_themes +
  geom_vline(xintercept = global_mean, 
             color = "grey40", 
             linetype = 3) +
  # Annotation
  annotate(
    "text",
    x = x_start, y = y_start,
    label = "The\nglobal\naverage",
    vjust = 1, size = 3, color = "grey40"
  ) +
  annotate(
    "curve",
    x = x_start, y = y_start,
    xend = x_end, yend = y_end,
    arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
    color = "grey40"
  )